Data Deduplication System Based on Content-Defined Chunking Using Bytes Pair Frequency Occurrence
Similar resources
FastCDC: a Fast and Efficient Content-Defined Chunking Approach for Data Deduplication
Content-Defined Chunking (CDC) has been playing a key role in data deduplication systems in the past 15 years or so due to its high redundancy detection ability. However, existing CDC-based approaches introduce heavy CPU overhead because they declare the chunk cutpoints by computing and judging the rolling hashes of the data stream byte by byte. In this paper, we propose FastCDC, a Fast and eff...
Leap-based Content Defined Chunking - Theory and Implementation
Content Defined Chunking (CDC) is an important component in data deduplication, which affects both the deduplication ratio as well as deduplication performance. The sliding-window-based CDC algorithm and its variants have been the most popular CDC algorithms for the last 15 years. However, their performance is limited in certain application scenarios since they have to slide byte by byte. The a...
Bimodal Content Defined Chunking for Backup Streams
Data deduplication has become a popular technology for reducing the amount of storage space necessary for backup and archival data. Content defined chunking (CDC) techniques are well established methods of separating a data stream into variable-size chunks such that duplicate content has a good chance of being discovered irrespective of its position in the data stream. Requirements for CDC incl...
A Logistic Based Mathematical Model to Optimize Duplicate Elimination Ratio in Content Defined Chunking Based Big Data Storage System
Longxiang Wang 1, Xiaoshe Dong 1, Xingjun Zhang 1,*, Fuliang Guo 1, Yinfeng Wang 2 and Weifeng Gong 3. 1 The School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China. 2 The Shenzhen Institute of Information Technology, Shenzhen 518172, China; wangyi...
System Identification Based on Frequency Response Noisy Data
In this paper, a new algorithm for system identification based on frequency response is presented. In this method, given a set of magnitudes and phases of the system transfer function in a set of discrete frequencies, a system of linear equations is derived which has a unique and exact solution for the coefficients of the transfer function provided that the data is noise-free and the degrees of...
Journal
Journal title: Symmetry
Year: 2020
ISSN: 2073-8994
DOI: 10.3390/sym12111841